Intelligent backup and versioning

ABSTRACT

There is disclosed in one example a computing apparatus, including: a processor and a memory; a network interface to communicatively couple to a backup client; a storage to receive backup data from the client, including a plurality of versions and an associated reputation for each version, the associated reputation to indicate a probability that the version is valid; and instructions encoded within the memory to instruct the processor to: receive from the backup client a request to store a new version of the backup data; determine that the client has exceeded a backup threshold; identify a backup version having a lowest reputation for validity; and expunge the backup version having the lowest reputation for validity.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 15/382,787, filed on Dec. 19, 2016 and entitled “INTELLIGENT BACKUP AND VERSIONING,” Inventors Igor Muttik, et al. The disclosure of the prior application is hereby incorporated by reference in its entirety in the disclosure of this application.

FIELD OF THE SPECIFICATION

This disclosure relates in general to the field of computer security, and more particularly, though not exclusively, to a system and method for intelligent backup and versioning.

BACKGROUND

Ransomware is a denial of access attack on a computer system. In a typical ransomware attack, a malicious Trojan is installed on the target machine. The Trojan then locks down the computer, and demands a ransom for unlocking it. An unsophisticated ransomware may use an easily-circumvented attack, such as locking the user's account. But more advanced ransomware may actually encrypt the user's files, and demand a ransom in exchange for the decryption key. As it may be impractical to decrypt the files without the key, and as the user's files may be important, many users pay the ransom.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is best understood from the following detailed description when read with the accompanying figures. It is emphasized that, in accordance with the standard practice in the industry, various features are not necessarily drawn to scale, and are used for illustration purposes only. Where a scale is shown, explicitly or implicitly, it provides only one illustrative example. In other embodiments, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion.

FIG. 1 is a block diagram of a security-enabled network according to one or more examples of the present specification.

FIG. 2 is a block diagram of a computing device according to one or more examples of the present specification.

FIG. 3 is a block diagram of a server according to one or more examples of the present specification.

FIG. 4 is a block diagram of a backup architecture according to one or more examples of the present specification.

FIG. 5 is a block diagram illustration of backup versioning according to one or more examples of the present specification.

FIG. 6 is a flow chart of a method of intelligent backup according to one or more examples of the present specification.

FIG. 7 is a flow chart of a method of restoring a source according to one or more examples of the present specification.

SUMMARY

In an example, there is disclosed a computing apparatus, comprising: a processor and a memory; a network interface to communicatively couple to a backup client; a storage to receive backup data from the client, including a plurality of versions and an associated reputation for each version, the associated reputation to indicate a probability that the version is valid; and instructions encoded within the memory to instruct the processor to: receive from the backup client a request to store a new version of the backup data; determine that the client has exceeded a backup threshold; identify a backup version having a lowest reputation for validity; and expunge the backup version having the lowest reputation for validity.

EMBODIMENTS OF THE DISCLOSURE

The following disclosure provides many different embodiments, or examples, for implementing different features of the present disclosure. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. Further, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. Different embodiments may have different advantages, and no particular advantage is necessarily required of any embodiment.

Ransomware is a growing problem that can negatively impact both individual users and enterprises, and that can target a wide array of devices, from smartphones, tablets, and Ultrabooks®, to cloud server farms and high-performance computing (HPC).

An initial approach to combatting ransomware may include recognizing unwanted file changes via rules and heuristic methods. This behavioral approach to ransomware is effective in some contexts, but may be limited by the malware author's ability to learn the rules and create workarounds to them. In certain embodiments, in addition to or instead of such behavioral approaches, protected backups and/or filesystems may be maintained, including an inventory or index of modifications made to each file or object. This may be referred to as “backup versioning,” and each modification may be assigned a version label. Throughout this specification, the file, folder, object, disk, or store being backed up may be referred to as the “source.” The source should be broadly understood to include any data object that can be backed up using the methods disclosed herein or equivalent methods. Depending on context, when restoring, the “source” may be anything from an individual file that was lost to an entire filesystem of a crashed disk.

In each of the following examples and throughout this specification, the source may be stored as a “destination object” in a “backup destination.” The destination object may include, by way of non-limiting example, a backup file or folder. The destination object may be stored on the same disk (e.g., a “.tar,” “.tgz,” “.zip,” “.iso,” “.img,” backup folder, or backup file or folder), an external target such as an optical medium (CD/DVD/Blu-Ray), an external hard drive, a peer machine, a tape drive, a dedicated backup server or appliance, a storage array such as redundant array of independent disks (RAID) or redundant array of independent nodes (RAIN), network attached storage (NAS), or a third-party or “cloud” service that may use any or all of the foregoing. Different backup approaches have different advantages and disadvantages. For example:

a. Full Backups: Most backup schemes start with a “full backup,” in which all of the data in the source are copied en masse to a destination object. This may be a single archival file or mirrored folder, or it may be distributed in a RAID or RAIN fashion. In some cases, particularly in the case of novice or home users, the backup may simply be copied (such as by a “cron” job overnight, or even manually when the user thinks about it) to the backup destination, and each subsequent backup is itself a full backup of this type. The user may overwrite the backup each time, or may keep a set number of backup copies (e.g., only keep the three most recent backups). This type of backup scheme is easy to set up and administer, and can be very easy to recover from provided the desired backup is available and not compromised (e.g., as long as the backup drive has not failed and the backup is valid, just unzip the backup file into the user's home directory). But this type of backup can also consume huge amounts of storage, as the user may have changed only a small percentage of files between each backup. Thus, each individual copy of “backup.tgz” may include copies of thousands upon thousands of identical files, consuming much space with little incremental benefit.

b. Distributed Backups: In more sophisticated systems, rather than having a single stored copy of the source on the backup destination, a large full backup may be maintained in a distributed fashion on a RAID or RAIN system or similar. In that case, the destination object may be divided into m slices, such that for n<m, the destination object (and thus the source) can be reconstructed with any n slices. Any type of destination object (e.g., full, differential, incremental) may be stored in a distributed backup.

c. Differential Backups: A differential backup includes a single full backup, and a single cumulative backup of all changes since the last full backup (a “delta”). In other words, the delta tracks all changes since the last full backup. Each subsequent backup results in a new delta, and the previous delta may be discarded (or kept as a previous version). This type of backup is more efficient than repeated full backups because the delta is often much smaller than the full backup. To maintain several backup versions, several deltas are required, each one representing a “snapshot” of the state of the source at the time of the delta. Advantageously, differential backups provide quick recovery times, as only the full backup and the target delta are required to restore the source. However, as the full backup ages, the delta generally grows larger as more files are changed. Not only does this increase storage demands, but it can also increase restoration time relative to an incremental backup.

d. Filesystem Level Incremental Backups: Filesystem level incremental backups consider only those files that have been changed since the last backup, and create a delta from the last incremental backup. Thus, each delta is dependent on the last delta, and to correctly reconstruct the destination object (and thus the source), each delta in the chain is required. This results in somewhat greater complexity than a differential backup, and may be considered more fragile in that a long chain of deltas may be required rather than just one full backup and one delta. However, daily incremental backups are generally relatively small (a user may work on only a small number of files in a given day), meaning that it is often practical to store these with great redundancy. Furthermore, as the full backup object ages and the number of changes increases, restoration from an incremental backup may be faster and more efficient than restoration from a single differential delta, which may have grown quite large. One limitation of filesystem level incremental backups is that when large files are changed frequently, they are backed up many times. For example, consider a user who stores personal information such as bank accounts, social security numbers, and critical business data in a single large, encrypted virtual container. Because the file system (and thus the backup software) may treat that container as a single file, any time a file inside the container is changed, the whole container is backed up. The same issue may be encountered by users who author large files, such as video files.

e. File Versioning Backups: Utilities such as “Git,” “CVS,” “Subversion,” and many others keep file level incremental backups. In these cases, the delta is kept not at the filesystem level, but at the file level. This type of versioning is often used with text-based files, such as program source or assembly files, or text documents. Although binary deltas are also possible, they are more difficult and computationally-intensive to maintain. The advantage of file versioning is that common portions of a large file need not be stored numerous times (as in the filesystem level backup). Rather, the delta includes changes to the file itself. Note that differential (versus incremental) file versioning is also possible, though it is less common.
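By way of non-limiting illustration, the difference between restoring from a differential delta and restoring from an incremental chain may be sketched as follows. This is a minimal sketch only: the in-memory `Snapshot` and `Delta` mappings and the function names are hypothetical stand-ins for a real backup format and are not part of the disclosed system.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for a filesystem: path -> file content.
Snapshot = Dict[str, str]
# A delta maps a path to its new content, or to None if the path was deleted.
Delta = Dict[str, Optional[str]]


def apply_delta(base: Snapshot, delta: Delta) -> Snapshot:
    """Return a new snapshot with the delta applied; the base is not modified."""
    result = dict(base)
    for path, content in delta.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


def restore_differential(full: Snapshot, deltas: List[Delta], version: int) -> Snapshot:
    """Each differential delta is cumulative since the full backup,
    so only the full backup and the one target delta are needed."""
    return apply_delta(full, deltas[version])


def restore_incremental(full: Snapshot, deltas: List[Delta], version: int) -> Snapshot:
    """Each incremental delta depends on the previous one,
    so every delta in the chain up to the target must be replayed."""
    snapshot = full
    for delta in deltas[: version + 1]:
        snapshot = apply_delta(snapshot, delta)
    return snapshot


# Day 1 changes a.txt; day 2 changes b.txt.
full = {"a.txt": "v0", "b.txt": "v0"}
incremental = [{"a.txt": "v1"}, {"b.txt": "v1"}]
differential = [{"a.txt": "v1"}, {"a.txt": "v1", "b.txt": "v1"}]
```

In this example both schemes restore the same state for the second version, but the differential delta for day 2 repeats the day 1 change, while the incremental chain must be replayed in order.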

As suitable to a particular embodiment, one of the foregoing methods (or any other suitable method) may be used to create a protected backup with versioning (where “versioning” may be incremental, differential, filesystem level, file level, etc.). This may allow restoration of data even if ransomware compromises the system and applies encryption or deletes original objects. The source may be restored to the last version before the encryption or deletion occurred.

In a perfect theoretical framework, an unlimited number of backup versions can be retained. If a source is compromised by ransomware, or by any other data loss event (such as, by way of non-limiting example, a hard drive failure, accidental deletion, major changes that need to be “rolled back,” or accidental overwrite), the backup can be rolled back to the last “good” version without the flaw. However, in practice, computing resources are limited, and it may not be practical to retain unlimited backup versions. If versioned backups are stored indiscriminately, certain difficulties arise. For example:

a. The space required may grow quickly.

b. Many cloud backup services bill their service as “unlimited.” However, storing unlimited versioned user data (instead of a single, unversioned destination object) stresses the infrastructure of the cloud provider, posing scalability and cost concerns.

c. When a ransomware attack does occur, it may be difficult for the user to find the last “good” (uncompromised) version.

d. Indiscriminate versioning provides only limited protection against malware that deliberately performs “data diddling” (making small and infrequent modifications to the data that ultimately result in corruption). This may be for the purpose of making it difficult to detect changes. While data diddling may, in some cases, be less destructive than encrypting, it can also be harder to detect. Making subtle changes to a document (e.g., changing a few words here and there) could cause serious problems depending on the importance of the document, and may be difficult to fix without a complete review of the document.

As a compromise solution, many cloud backup services offer some limited versioning. For example, a service may store rolling incremental backups that go back 30 days. While this provides some protection, there are situations where it fails to mitigate a ransomware attack. For example, the user may have a file that is critical when he needs to access it, but that he does not access frequently, such as a tax return. The user may not access tax return files on a regular basis, but if he is audited, it may be critical for him to gain access to them. If the user is unaware of the ransomware attack, a corrupted or encrypted version of his tax data may be stored on the hard drive, and backed up to his cloud service. When the user goes to access the data, he may find that it has been compromised, and he is far past the 30-day window in which to access a last good (uncompromised) copy of the data.

Storage of versioned backups can be improved by introducing a size-based (rather than time-based) quota with discriminate version selection. When a user reaches the storage quota, rather than purging based on a rolling time window, the storage server purges the least reputable (i.e., least likely to be valuable) version of the backup, thus freeing up space and staying under quota. On the other hand, if data are compromised, the reputation can be used to select the version most likely to contain a useful backup (i.e., before the ransomware attack, or other data loss event).
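The size-based quota with reputation-based purging described above may be sketched as follows. This is an illustrative sketch under stated assumptions: the `Version` record, the function names, and the convention that a larger scalar means a better reputation are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Version:
    label: str
    size: int          # stored size in bytes
    reputation: float  # assumed convention: larger means more likely valid


def purge_to_quota(
    versions: List[Version], incoming_size: int, quota: int
) -> Tuple[List[Version], List[Version]]:
    """Remove the least reputable versions, one at a time, until the
    incoming version fits under the quota. Returns (remaining, removed)."""
    remaining = list(versions)
    removed: List[Version] = []
    used = sum(v.size for v in remaining)
    while remaining and used + incoming_size > quota:
        victim = min(remaining, key=lambda v: v.reputation)
        remaining.remove(victim)
        used -= victim.size
        removed.append(victim)
    return remaining, removed


def best_restore_candidate(versions: List[Version]) -> Version:
    """After a data loss event, pick the version most likely to be useful."""
    return max(versions, key=lambda v: v.reputation)


versions = [
    Version("v1", 40, 0.9),
    Version("v2", 40, 0.2),
    Version("v3", 40, 0.7),
]
remaining, removed = purge_to_quota(versions, incoming_size=40, quota=120)
```

Here the store is already at its 120-byte quota, so the incoming 40-byte version displaces "v2," the version with the lowest reputation, regardless of its age.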

The reputation may be determined in context, as well as from a delta between the current destination object and a new incoming destination object. In one embodiment, a relatively high delta may indicate a potential threat, and may result in a backup version with a relatively lower reputation, while a relatively low delta may indicate a likely-good backup with a relatively higher reputation (i.e., the reputation varies inversely with the magnitude of the delta).
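One simple way to realize a reputation that decreases as the delta grows is sketched below. The linear mapping onto the range 0.0 to 1.0 and the function name are assumptions made purely for illustration; the specification does not prescribe any particular formula.

```python
def reputation_from_delta(changed_bytes: int, total_bytes: int) -> float:
    """Map the fraction of the destination object that changed onto a
    reputation in [0.0, 1.0]. Assumed mapping: linear and decreasing,
    so an unchanged object scores 1.0 and a fully rewritten one 0.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = changed_bytes / total_bytes
    ratio = min(max(ratio, 0.0), 1.0)  # clamp in case the object grew
    return 1.0 - ratio
```

For example, a backup in which one quarter of the bytes changed would receive a reputation of 0.75 under this assumed mapping, while a backup in which every byte changed (as might follow wholesale encryption) would receive 0.0.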

Note that in this embodiment, a larger scalar number represents a “better” reputation, but this is a non-limiting example. In other embodiments, a smaller number could represent a better reputation. In yet other embodiments, the reputation could be a multi-faceted vector or matrix, with numerous fields that could encode various factors, such as contextual data. In that case, the absolute magnitude of the reputation may not be the only thing considered. Rather, depending on the context, different fields in the reputation could have greater or lesser sway.

In one embodiment, a method by which an intelligent backup engine stores a new backup version and assigns it a reputation comprises:

a. A local backup agent requests a new backup. (Note that alternately, a cloud-based backup server could “pull” the backup from the local device).

b. The intelligent backup engine determines whether new data are being created (i.e., this is a new destination object), or whether existing data are being updated (resulting in a delta).

c. For updates, the intelligent backup engine retrieves the context and/or reputation for the previous version.

d. The storage engine checks the quota, which may be assigned to the user, the user's machine, the disk, or this destination object specifically, by way of non-limiting example.

e. If the quota has been reached, the intelligent backup engine removes the backup with the lowest reputation, thus preserving the quota requirement without removing high reputation backups.

f. In some embodiments, rather than completely removing lower reputation versions, the intelligent backup engine may move that version to a less expensive backup medium. For example, if the backup is local, the low-reputation backup may be moved to an off-site cloud storage. Or the low-reputation version may be moved from an active backup server (RAID, RAIN, NAS, etc.) to an archival storage (e.g., a tape archive) that is less easily accessible. This may be referred to as “demoting” that backup version (versus “removing” or “deleting” the backup).

g. In some embodiments, the intelligent backup engine may also log which backup version or versions were demoted or removed, and as appropriate, may log data on accessing the demoted version. For example, the log may include a uniform resource locator (URL) for accessing an off-site or cloud backup, or may include a serial number and date for locating a specific tape where the version was stored.

h. The intelligent backup engine stores a new version (e.g., delta information), along with the context. The context may include, for example, a structure comprising, by way of non-limiting example, {version, link to the previous version, data source, reputation of the software that initiated the backup, timestamp, writing pattern, entropy, history, comparison with other contexts and/or history, and user presence}.

i. Certain embodiments may also store the computed reputation.

j. The intelligent backup engine completes the backup operation.
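The steps above can be sketched as a small in-memory store. This is a hedged illustration only: the class names, the subset of context fields chosen, the `archive` list standing in for a less expensive medium, and the log message format are all hypothetical, and a real engine would persist data rather than hold it in lists.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BackupContext:
    """A subset of the context fields listed in step h (names are illustrative)."""
    version: int
    previous_version: Optional[int]
    data_source: str
    initiator_reputation: float
    timestamp: float
    entropy: float
    user_present: bool
    reputation: Optional[float] = None  # step i: the computed reputation


@dataclass
class StoredVersion:
    context: BackupContext
    payload: bytes


@dataclass
class BackupStore:
    quota_bytes: int
    active: List[StoredVersion] = field(default_factory=list)
    archive: List[StoredVersion] = field(default_factory=list)  # cheaper tier
    log: List[str] = field(default_factory=list)

    def used(self) -> int:
        return sum(len(v.payload) for v in self.active)

    def store(self, payload: bytes, context: BackupContext) -> None:
        # Steps d-e: check the quota and make room if needed.
        while self.active and self.used() + len(payload) > self.quota_bytes:
            victim = min(self.active, key=lambda v: v.context.reputation or 0.0)
            # Step f: demote to the archive tier instead of deleting.
            self.active.remove(victim)
            self.archive.append(victim)
            # Step g: record what was demoted and where it went.
            self.log.append(f"demoted version {victim.context.version} to archive")
        # Steps h-j: store the new version together with its context.
        self.active.append(StoredVersion(context, payload))


store = BackupStore(quota_bytes=10)
for number, rep in [(1, 0.9), (2, 0.1), (3, 0.5)]:
    ctx = BackupContext(
        version=number,
        previous_version=number - 1 if number > 1 else None,
        data_source="home-dir",
        initiator_reputation=1.0,
        timestamp=float(number),
        entropy=4.0,
        user_present=True,
        reputation=rep,
    )
    store.store(b"data", ctx)
```

With a 10-byte quota and three 4-byte versions, storing the third version demotes version 2 (reputation 0.1) to the archive tier and logs the demotion, while versions 1 and 3 remain active.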

In certain embodiments, the analysis engine assigns the backup version a reputation based on one or more of the following contextual parameters:

a. Analysis of the entropy of the new backup data when compared to a reference entropy figure for the file type. For example, “Word” documents may have a reference entropy figure. Substantial deviation from this reference entropy (especially significantly greater entropy) may be a strong indicator that the file has been encrypted.

b. The timing of the backup may be compared to a pre-shared backup schedule. For example, unscheduled backups during non-office hours may be suspicious.

c. The backup write pattern may be analyzed. Is it one short burst or a long one? Is it one file or many? Does it match previous write patterns or is it out of the ordinary?

d. User presence may influence reputation. For example, if the user is present and initiates the backup from trusted backup software, the backup may have a higher reputation than a backup that was initiated automatically (especially off-schedule) by an unknown or untrusted backup software. This could also include looking to see whether a remote session was active, and whether it was secure and/or trusted.

e. The global reputation of the software that initiated the backup may be considered. For example, MCAFEE, LLC's Global Threat Intelligence (GTI™) database may be queried to retrieve a global signature and reputation. If the application conforms to the signature and has a high reputation, it may be more trusted than an unknown application, or one without a good reputation.

f. The presence of a trust operation may also be considered. For example, the backup operation may have been confirmed out-of-band by the user (e.g., via a code sent to the user via SMS), which may increase the reputation of the resulting version.

g. The (filesystem level) delta between incremental versions may be considered. For example, backups with low deltas may have relatively lower reputations than versions with higher deltas, because only a few files were changed. This may be indicative of malware gradually changing only one or two files at a time (“data diddling”). This factor could also be exacerbated if, for example, one or two infrequently-accessed files are changed at a time, with many such files changing over time. In some embodiments, data diddling may be detected heuristically. For example, data diddling may include modifying portions of a document that users do not frequently edit, such as header data, metadata, or other hidden data. Small changes to such data may be deemed suspicious.

h. The intelligent backup engine may correlate backup activity across multiple users and detect anomalies via rules or machine learning methods (e.g., if many users have out-of-schedule backups with a similar pattern, it may indicate that the same ransomware attacked multiple targets).

i. At the file level, the size of the delta between the current file and the current backup may be considered. Small deltas may be expected for some file types (e.g., plain text configuration files). Other file types may have much larger deltas (for example, although they are “text,” Microsoft Word documents are stored internally as binary “.zip” files, which may result in relatively large file level changes). Thus, if a file has a large delta, this may indicate that the current backup needs a high reputation to preserve it.

j. Structured file types may also be a factor in assigning a reputation. For example, either ransomware or a file write error could be responsible for a change in data that is not conformant to the data structure. If a write to a “.zip” file results in a corruption that renders the final form of the file invalid as a .zip file, then it may be advantageous to retain the last good and usable version of the file. This is true whether the write was the result of malicious activity, a filesystem error, or user error.
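The entropy comparison of item a may be illustrated with a short sketch that computes the Shannon entropy of the backup data in bits per byte and compares it against a reference figure. The function names, the `tolerance` parameter, and the reference value used in the example are hypothetical; real reference figures would have to be measured per file type.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging from 0.0 to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_encrypted(data: bytes, reference_entropy: float, tolerance: float = 1.0) -> bool:
    """Flag data whose entropy exceeds the file type's reference figure by
    more than the tolerance. Both thresholds are assumed values."""
    return shannon_entropy(data) > reference_entropy + tolerance


plain = b"hello hello hello hello"
uniform = bytes(range(256))  # every byte value once: maximal entropy

# 4.5 bits per byte is an invented reference figure for this example.
plain_flag = looks_encrypted(plain, reference_entropy=4.5)
uniform_flag = looks_encrypted(uniform, reference_entropy=4.5)
```

Well-encrypted data is close to uniformly distributed and so approaches 8 bits per byte, which is why a large upward deviation from the reference is treated as suspicious. Note that legitimately compressed formats also have high entropy, so the reference figure must be specific to the file type.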

In certain embodiments, the intelligent backup engine performs “garbage collection” in several ways:

a. When the quota is reached, the engine removes backup entries with the lowest reputation. The intelligent backup engine may proactively avoid quota violation by automatically managing space in the background (e.g., when the cloud provider experiences lower loads) by deleting low-reputation backups or moving backups to other media.

b. Certain embodiments may be configured so that the intelligent backup engine keeps the highest delta versions of different files in all cases. For example, in the case of t.txt and t.zip, both copies would be retained (as they have dramatically different content), but incremental old versions of t.txt that differ slightly may be purged more often.

c. In cases of file deletion (rather than overwrite), the last deleted version of a file may naturally be assigned a high reputation (purge less often or never).
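A background collection pass of this kind may be sketched as follows. The sketch assumes a hypothetical "low watermark" (a fraction of the quota to trim down to, so that the quota is not hit during a later backup) and models item c by never removing the last version of a deleted file; neither the watermark nor the field names come from the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entry:
    name: str
    size: int
    reputation: float
    last_before_delete: bool = False  # item c: final version of a deleted file


def background_collect(
    entries: List[Entry], quota: int, low_watermark: float = 0.8
) -> Tuple[List[Entry], List[Entry]]:
    """Trim usage down to low_watermark * quota, removing the lowest
    reputation entries first and never removing protected entries.
    Returns (kept, removed)."""
    target = quota * low_watermark
    kept = list(entries)
    used = sum(e.size for e in kept)
    candidates = sorted(
        (e for e in kept if not e.last_before_delete),
        key=lambda e: e.reputation,
    )
    removed: List[Entry] = []
    for entry in candidates:
        if used <= target:
            break
        kept.remove(entry)
        used -= entry.size
        removed.append(entry)
    return kept, removed


entries = [
    Entry("t.txt@1", 30, 0.3),
    Entry("t.txt@2", 30, 0.6),
    Entry("old.doc@final", 30, 0.1, last_before_delete=True),
    Entry("t.zip@1", 30, 0.9),
]
kept, removed = background_collect(entries, quota=100)
```

In this example the two slightly differing versions of t.txt are purged, while t.zip and the final version of the deleted old.doc are retained even though old.doc carries the lowest numeric reputation.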

A system and method for intelligent backup and versioning will now be described with more particular reference to the attached FIGURES. It should be noted that throughout the FIGURES, certain reference numerals may be repeated to indicate that a particular device or block is wholly or substantially consistent across the FIGURES. This is not, however, intended to imply any particular relationship between the various embodiments disclosed. In certain examples, a genus of elements may be referred to by a particular reference numeral (“widget 10”), while individual species or examples of the genus may be referred to by a hyphenated numeral (“first specific widget 10-1” and “second specific widget 10-2”).

FIG. 1 is a network-level diagram of a secured enterprise 100 according to one or more examples of the present specification. In the example of FIG. 1, secured enterprise 100 may be configured to provide services or data to one or more customers 162, who may access information or services via external network 172. This may require secured enterprise 100 to at least partly expose certain services and networks to the outside world, thus creating a logical security aperture.

Within secured enterprise 100, one or more users 120 operate one or more client devices 110. Each device may include an appropriate operating system, such as Microsoft Windows, Linux, Android, Mac OSX, Apple iOS, Unix, or similar. Some of the foregoing may be more often used on one type of device than another. For example, desktop computers or engineering workstations may be more likely to use one of Microsoft Windows, Linux, Unix, or Mac OSX. Laptop computers, which are usually portable off-the-shelf devices with fewer customization options, may be more likely to run Microsoft Windows or Mac OSX. Mobile devices may be more likely to run Android or iOS. However, these examples are not intended to be limiting.

Client devices 110 may be communicatively coupled to one another and to other network resources via enterprise network 170. Enterprise network 170 may be any suitable network or combination of one or more networks operating on one or more suitable networking protocols, including for example, a local area network, an intranet, a virtual network, a wide area network, a wireless network, a cellular network, or the internet (optionally accessed via a proxy, virtual machine, or other similar security mechanism) by way of non-limiting example. Enterprise network 170 may also include one or more servers, firewalls, routers, switches, security appliances, antivirus servers, or other useful network devices, which in an example may be virtualized within workload cluster 142. In this illustration, enterprise network 170 is shown as a single network for simplicity, but in some embodiments, enterprise network 170 may include a large number of networks, such as one or more enterprise intranets connected to the internet. Enterprise network 170 may also provide access to an external network, such as the internet, via external network 172. External network 172 may similarly be any suitable type of network.

A workload cluster 142 may be provided, for example as a virtual cluster running in a hypervisor on a plurality of rack-mounted blade servers, or as a cluster of physical servers. Workload cluster 142 may provide one or more server functions, or one or more “microclouds” in one or more hypervisors. For example, a virtualization environment such as vCenter may provide the ability to define a plurality of “tenants,” with each tenant being functionally separate from each other tenant, and each tenant operating as a single-purpose microcloud. Each microcloud may serve a distinctive function, and may include a plurality of virtual machines (VMs) of many different flavors, including agentful and agentless VMs.

It should also be noted that some functionality of endpoint devices 110 may also be provided via workload cluster 142. For example, one microcloud may provide a remote desktop hypervisor such as a Citrix workspace, which allows users 120 operating endpoints 110 to remotely log in to a remote enterprise desktop and access enterprise applications, workspaces, and data. In that case, endpoint 110 could be a “thin client” such as a Google Chromebook, running only a stripped-down operating system, and still provide user 120 useful access to enterprise resources.

One or more computing devices configured as a management console 140 may also operate on enterprise network 170. Management console 140 may provide a user interface for a security administrator 150 to define enterprise security policies, which management console 140 may enforce on enterprise network 170 and across client devices 110 and workload cluster 142. In an example, management console 140 may run a server-class operating system, such as Linux, Unix, or Windows Server. In other cases, management console 140 may be provided as a web interface, on a desktop-class machine, or via a VM provisioned within workload cluster 142.

Secured enterprise 100 may encounter a variety of “security objects” on the network. A security object may be any object that operates on or interacts with enterprise network 170 and that has actual or potential security implications. In one example, security objects may be broadly divided into hardware objects, including any physical device that communicates with or operates via the network, and software objects. Software objects may be further subdivided as “executable objects” and “static objects.” Executable objects include any object that can actively execute code or operate autonomously, such as applications, drivers, programs, executables, libraries, processes, runtimes, scripts, macros, binaries, interpreters, interpreted language files, configuration files with inline code, embedded code, and firmware instructions by way of non-limiting example. A static object may be broadly designated as any object that is not an executable object or that cannot execute, such as documents, pictures, music files, text files, configuration files without inline code, videos, and drawings by way of non-limiting example. In some cases, hybrid software objects may also be provided, such as for example a word processing document with built-in macros or an animation with inline code. For security purposes, these may be considered as a separate class of software object, or may simply be treated as executable objects.

Secured enterprise 100 may communicate across enterprise boundary 104 with external network 172. Enterprise boundary 104 may represent a physical, logical, or other boundary. External network 172 may include, for example, websites, servers, network protocols, and other network-based services. In one example, an application repository 160 is available via external network 172, and an attacker 180 (or other similar malicious or negligent actor) also connects to external network 172. A security services provider 190 may provide services to secured enterprise 100.

It may be a goal of users 120 and secured enterprise 100 to successfully operate client devices 110 and workload cluster 142 without interference from attacker 180 or from unwanted security objects. In one example, attacker 180 is a malware author whose goal or purpose is to cause malicious harm or mischief, for example by injecting malicious object 182 (which may be, for example, malware) into client device 110. Once malicious object 182 gains access to client device 110, it may try to perform work such as “data diddling,” encrypting important files, corrupting files, or other malicious activity such as social engineering of user 120, a hardware-based attack on client device 110, modifying storage 350 (FIG. 3), modifying client application 112 (which may be running in memory), or gaining access to enterprise servers 142.

The malicious harm or mischief may take the form of installing rootkits or other malware on client devices 110 to tamper with the system, installing spyware or adware to collect personal and commercial data, defacing websites, operating a botnet such as a spam server, or simply annoying and harassing users 120. Thus, one aim of attacker 180 may be to install his malware on one or more client devices 110. As used throughout this specification, malicious software (“malware”) includes any security object configured to provide unwanted results or do unwanted work. In many cases, malware objects will be executable objects, including by way of non-limiting examples, viruses, Trojans, zombies, rootkits, backdoors, worms, spyware, adware, ransomware, dialers, payloads, malicious browser helper objects, tracking cookies, loggers, or similar objects designed to take a potentially-unwanted action, including by way of non-limiting example data destruction, covert data collection, browser hijacking, network proxy or redirection, covert tracking, data logging, keylogging, excessive or deliberate barriers to removal, contact harvesting, and unauthorized self-propagation.

Attacker 180 may also want to commit industrial or other espionage against secured enterprise 100, such as stealing classified or proprietary data, stealing identities, or gaining unauthorized access to enterprise resources. Thus, attacker 180's strategy may also include trying to gain physical access to one or more client devices 110 and operating them without authorization, so that an effective security policy may also include provisions for preventing such access.

In another example, a software developer may not explicitly have malicious intent, but may develop software that poses a security risk. For example, a well-known and often-exploited security flaw is the so-called buffer overrun, in which a malicious user is able to enter an overlong string into an input form and thus gain the ability to execute arbitrary instructions or operate with elevated privileges on a computing device. Buffer overruns may be the result, for example, of poor input validation or use of insecure libraries, and in many cases arise in nonobvious contexts. Thus, although not malicious, a developer contributing software to application repository 160 may inadvertently provide attack vectors for attacker 180. Poorly-written applications may also cause inherent problems, such as crashes, data loss, or other undesirable behavior. Because such software may be desirable itself, it may be beneficial for developers to occasionally provide updates or patches that repair vulnerabilities as they become known. However, from a security perspective, these updates and patches are essentially new objects that must themselves be validated.

Application repository 160 may represent a Windows or Apple “App Store” or update service, a Unix-like repository or ports collection, or other network service providing users 120 the ability to interactively or automatically download and install applications on client devices 110. If application repository 160 has security measures in place that make it difficult for attacker 180 to distribute overtly malicious software, attacker 180 may instead stealthily insert vulnerabilities into apparently-beneficial applications.

In some cases, secured enterprise 100 may provide policy directives that restrict the types of applications that can be installed from application repository 160. Thus, application repository 160 may include software that is not negligently developed and is not malware, but that is nevertheless against policy. For example, some enterprises restrict installation of entertainment software like media players and games. Thus, even a secure media player or game may be unsuitable for an enterprise computer. Security administrator 150 may be responsible for distributing a computing policy consistent with such restrictions and enforcing it on client devices 110.

Secured enterprise 100 may also contract with or subscribe to a security services provider 190, which may provide security services, updates, antivirus definitions, patches, products, and services. MCAFEE, LLC is a non-limiting example of such a security services provider that offers comprehensive security and antivirus solutions. In some cases, security services provider 190 may include a threat intelligence capability such as the global threat intelligence (GTI™) database provided by MCAFEE, LLC. Security services provider 190 may update its threat intelligence database by analyzing new candidate malicious objects as they appear on client networks and characterizing them as malicious or benign.

In another example, secured enterprise 100 may simply be a family, with parents assuming the role of security administrator 150. The parents may wish to protect their children from undesirable content, such as pornography, adware, spyware, age-inappropriate content, advocacy for certain political, religious, or social movements, or forums for discussing illegal or dangerous activities, by way of non-limiting example. In this case, the parent may perform some or all of the duties of security administrator 150.

When a new object is first encountered on the network, security policies may initially treat it as “gray” or “suspect.” As a first line of defense, a security appliance in cluster 142 may query security services provider 190 to see if the new object has a globally-recognized reputation. If so, a local reputation may be generated based on that global reputation. If not, the object is completely new and may be treated as a “candidate malicious object,” meaning that its status is unknown, and it may therefore be a malicious object. At a minimum, the new object may be proscribed in its access to protected resources until its reputation can be established. This may mean that extra permission from a user 120 or security administrator 150 is required for the candidate malicious object to access protected resources.
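
By way of non-limiting illustration only, the first-line reputation check described above might be sketched as follows. The names `ObjectStatus`, `classify_new_object`, and `query_global` are hypothetical and do not appear in this specification. Copying the global reputation directly into the local reputation, and not gating objects that already have a global reputation, are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ObjectStatus:
    label: str                       # "known" or "candidate_malicious"
    local_reputation: Optional[int]  # None while no reputation is established
    needs_permission: bool           # extra permission required for protected resources


def classify_new_object(
    object_id: str,
    query_global: Callable[[str], Optional[int]],
) -> ObjectStatus:
    """Hypothetical first-line check for a newly encountered object."""
    global_rep = query_global(object_id)
    if global_rep is not None:
        # A global reputation exists: derive the local reputation from it.
        # Assumption: the local value is simply a copy, and no gating is applied.
        return ObjectStatus("known", global_rep, needs_permission=False)
    # No global reputation: status is unknown, so access is restricted
    # until a reputation can be established.
    return ObjectStatus("candidate_malicious", None, needs_permission=True)


# Usage with a stand-in lookup table in place of a real reputation service.
known_reputations = {"editor.exe": 80}
status = classify_new_object("unknown.bin", known_reputations.get)
```

In this sketch the lookup is injected as a callable so that a real query to a security services provider could be substituted without changing the classification logic.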

The candidate malicious object may also be subjected to additional rigorous security analysis, particularly if it is a new object with no global reputation, or if it is an executable object. This may include, for example, submitting the object to an internal security audit, or to security services provider 190, for deep analysis. This may include running the object in a sandbox environment, expert status analysis, or other security techniques. These may help to establish a new reputation for the object.

If the object is permitted to operate on the network and malicious behavior is observed, the object may be tagged as malicious object 182. Remedial action may then be taken as appropriate or necessary. Thus, it is a goal of users 120 and security administrator 150 to configure and operate client devices 110, workload cluster 142, and enterprise network 170 so as to exclude all malicious objects, and to promptly and accurately classify candidate malicious objects.

FIG. 2 is a block diagram of client device 200 according to one or more examples of the present specification. Client device 200 may be any suitable computing device. In various embodiments, a “computing device” may be or comprise, by way of non-limiting example, a computer, workstation, server, mainframe, virtual machine (whether emulated or on a “bare-metal” hypervisor), embedded computer, embedded controller, embedded sensor, personal digital assistant, laptop computer, cellular telephone, IP telephone, smart phone, tablet computer, convertible tablet computer, computing appliance, network appliance, receiver, wearable computer, handheld calculator, or any other electronic, microelectronic, or microelectromechanical device for processing and communicating data. Any computing device may be designated as a host on the network. Each computing device may refer to itself as a “local host,” while any computing device external to it may be designated as a “remote host.”

In certain embodiments, client devices 110 may all be examples of a client device 200.

Client device 200 includes a processor 210 connected to a memory 220, having stored therein executable instructions for providing an operating system 222 and at least software portions of a backup agent 224. Other components of client device 200 include a storage 250, network interface 260, and peripheral interface 240. This architecture is provided by way of example only, and is intended to be non-exclusive and non-limiting. Furthermore, the various parts disclosed are intended to be logical divisions only, and need not necessarily represent physically separate hardware and/or software components. Certain computing devices provide main memory 220 and storage 250, for example, in a single physical memory device, and in other cases, memory 220 and/or storage 250 are functionally distributed across many physical devices. In the case of VMs or hypervisors, all or part of a function may be provided in the form of software or firmware running over a virtualization layer to provide the disclosed logical function. In other examples, a device such as a network interface 260 may provide only the minimum hardware interfaces necessary to perform its logical operation, and may rely on a software driver to provide additional necessary logic. Thus, each logical block disclosed herein is broadly intended to include one or more logic elements configured and operable for providing the disclosed logical operation of that block. As used throughout this specification, “logic elements” may include hardware, external hardware (digital, analog, or mixed-signal), software, reciprocating software, services, drivers, interfaces, components, modules, algorithms, sensors, firmware, microcode, programmable logic, or objects that can coordinate to achieve a logical operation.

In an example, processor 210 is communicatively coupled to memory 220 via memory bus 270-3, which may be, for example, a direct memory access (DMA) bus, though other memory architectures are possible, including ones in which memory 220 communicates with processor 210 via system bus 270-1 or some other bus. Processor 210 may be communicatively coupled to other devices via a system bus 270-1. As used throughout this specification, a “bus” includes any wired or wireless interconnection line, network, connection, bundle, single bus, multiple buses, crossbar network, single-stage network, multistage network or other conduction medium operable to carry data, signals, or power between parts of a computing device, or between computing devices. It should be noted that these uses are disclosed by way of non-limiting example only, and that some embodiments may omit one or more of the foregoing buses, while others may employ additional or different buses.

In various examples, a “processor” may include any combination of logic elements operable to execute instructions, whether loaded from memory, or implemented directly in hardware, including by way of non-limiting example a microprocessor, digital signal processor, field programmable gate array, graphics processing unit, programmable logic array, application specific integrated circuit, or virtual machine processor. In certain architectures, a multi-core processor may be provided, in which case processor 210 may be treated as only one core of a multi-core processor, or may be treated as the entire multi-core processor, as appropriate. In some embodiments, one or more co-processors may also be provided for specialized or support functions.

Processor 210 may be connected to memory 220 in a DMA configuration via DMA bus 270-3. To simplify this disclosure, memory 220 is disclosed as a single logical block, but in a physical embodiment may include one or more blocks of any suitable volatile or non-volatile memory technology or technologies, including for example DDR random access memory (RAM), SRAM, DRAM, cache, L1 or L2 memory, on-chip memory, registers, flash, read only memory (ROM), optical media, virtual memory regions, magnetic or tape memory, or similar. In certain embodiments, memory 220 may comprise a relatively low-latency volatile main memory, while storage 250 may comprise a relatively higher-latency non-volatile memory. However, memory 220 and storage 250 need not be physically separate devices, and in some examples may represent simply a logical separation of function. It should also be noted that although DMA is disclosed by way of non-limiting example, DMA is not the only protocol consistent with this specification, and that other memory architectures are available.

Storage 250 may be any species of memory 220, or may be a separate device. Storage 250 may include one or more non-transitory computer-readable mediums, including by way of non-limiting example, a hard drive, solid-state drive, external storage, RAID, NAS, optical storage, tape drive, backup system, cloud storage, or any combination of the foregoing. Storage 250 may be, or may include therein, a database or databases or data stored in other configurations, and may include a stored copy of operational software such as operating system 222 and software portions of backup agent 224. Many other configurations are also possible, and are intended to be encompassed within the broad scope of this specification.

Network interface 260 may be provided to communicatively couple client device 200 to a wired or wireless network. A “network,” as used throughout this specification, may include any communicative platform operable to exchange data or information within or between computing devices, including by way of non-limiting example, an ad-hoc local network, an internet architecture providing computing devices with the ability to electronically interact, a plain old telephone system (POTS), which computing devices could use to perform transactions in which they may be assisted by human operators or in which they may manually key data into a telephone or other suitable electronic equipment, any packet data network (PDN) offering a communications interface or exchange between any two nodes in a system, or any local area network (LAN), metropolitan area network (MAN), wide area network (WAN), wireless local area network (WLAN), virtual private network (VPN), intranet, or any other appropriate architecture or system that facilitates communications in a network or telephonic environment.

Backup agent 224, in one example, is operable to carry out computer-implemented methods as described in this specification. Backup agent 224 may include one or more tangible non-transitory computer-readable mediums having stored thereon executable instructions operable to instruct a processor to provide a backup agent 224. As used throughout this specification, an “engine” includes any combination of one or more logic elements, of similar or dissimilar species, operable for and configured to perform one or more methods provided by the engine. Thus, backup agent 224 may comprise one or more logic elements configured to provide methods as disclosed in this specification. In some cases, backup agent 224 may include a special integrated circuit designed to carry out a method or a part thereof, and may also include software instructions operable to instruct a processor to perform the method. In some cases, backup agent 224 may run as a “daemon” process. A “daemon” may include any program or series of executable instructions, whether implemented in hardware, software, firmware, or any combination thereof that runs as a background process, a terminate-and-stay-resident program, a service, system extension, control panel, bootup procedure, BIOS subroutine, or any similar program that operates without direct user interaction. In certain embodiments, daemon processes may run with elevated privileges in a “driver space” associated with ring 0, 1, or 2 in a protection ring architecture. It should also be noted that backup agent 224 may also include other hardware and software, including configuration files, registry entries, and interactive or user-mode software by way of non-limiting example.

In one example, backup agent 224 includes executable instructions stored on a non-transitory medium operable to perform a method according to this specification. At an appropriate time, such as upon booting client device 200 or upon a command from operating system 222 or a user 120, processor 210 may retrieve a copy of the instructions from storage 250 and load it into memory 220. Processor 210 may then iteratively execute the instructions of backup agent 224 to provide the desired method.

Note that in this embodiment, backup agent 224 is shown as a client-side application running in main memory, while a separate backup server 324 is shown in FIG. 3. This should be understood as a non-limiting example that illustrates only one of many possible configurations. In this particular example, backup agent 224 (client-side) may be a backup solution such as Acronis, Synology, or some other software. Backup agent 224 may also include a client engine for a cloud-based backup service, such as Carbonite, CrashPlan, or similar.

The “intelligent” aspect of backup and versioning may reside, in whole or in part, in backup agent 224, backup server 324, or both. Thus, in a general sense, the intelligent backup engine of this specification could include one or both of these, or some other similar structure.

Peripheral interface 240 may be configured to interface with any auxiliary device that connects to client device 200 but that is not necessarily a part of the core architecture of client device 200. A peripheral may be operable to provide extended functionality to client device 200, and may or may not be wholly dependent on client device 200. In some cases, a peripheral may be a computing device in its own right. Peripherals may include input and output devices such as displays, terminals, printers, keyboards, mice, modems, data ports (e.g., serial, parallel, USB, Firewire, or similar), network controllers, optical media, external storage, sensors, transducers, actuators, controllers, data acquisition buses, cameras, microphones, or speakers by way of non-limiting example.

In one example, peripherals include display adapter 242, audio driver 244, and input/output (I/O) driver 246. Display adapter 242 may be configured to provide a human-readable visual output, such as a command-line interface (CLI) or graphical desktop such as Microsoft Windows, Apple OSX desktop, or a Unix/Linux X Window System-based desktop. Display adapter 242 may provide output in any suitable format, such as a coaxial output, composite video, component video, VGA, or digital outputs such as digital video interface (DVI) or high definition multimedia interface (HDMI), by way of non-limiting example. In some examples, display adapter 242 may include a hardware graphics card, which may have its own memory and its own graphics processing unit (GPU). Audio driver 244 may provide an interface for audible sounds, and may include in some examples a hardware sound card. Sound output may be provided in analog (such as a 3.5 mm stereo jack), component (“RCA”) stereo, or in a digital audio format such as S/PDIF, AES3, AES47, HDMI, USB, Bluetooth or Wi-Fi audio, by way of non-limiting example.

FIG. 3 is a block diagram of a server-class device 300 according to one or more examples of the present specification. Server 300 may be any suitable computing device, as described in connection with FIG. 2. In general, the definitions and examples of FIG. 2 may be considered as equally applicable to FIG. 3, unless specifically stated otherwise. Server 300 is described herein separately to illustrate that in certain embodiments, logical operations according to this specification may be divided along a client-server model, wherein client device 200 provides certain localized tasks, while server 300 provides certain other centralized tasks. In contemporary practice, server 300 is more likely than client device 200 to be provided as a “headless” VM running on a computing cluster, or as a standalone appliance, though these configurations are not required.

Server 300 includes a processor 310 connected to a memory 320, having stored therein executable instructions for providing an operating system 322 and at least software portions of a backup server 324. Other components of server 300 include a storage 350, network interface 360, and peripheral interface 340. As described in FIG. 2, each logical block may be provided by one or more similar or dissimilar logic elements.

In an example, processor 310 is communicatively coupled to memory 320 via memory bus 370-3, which may be for example a DMA bus. Processor 310 may be communicatively coupled to other devices via a system bus 370-1.

Processor 310 may be connected to memory 320 in a DMA configuration via DMA bus 370-3, or via any other suitable memory configuration. As discussed in FIG. 2, memory 320 may include one or more logic elements of any suitable type.

Storage 350 may be any species of memory 320, or may be a separate device, as described in connection with storage 250 of FIG. 2. Storage 350 may be, or may include therein, a database or databases or data stored in other configurations, and may include a stored copy of operational software such as operating system 322 and software portions of backup server 324.

Network interface 360 may be provided to communicatively couple server 300 to a wired or wireless network, and may include one or more logic elements as described in FIG. 2.

Backup server 324 is an engine as described in FIG. 2 and, in one example, includes one or more logic elements operable to carry out computer-implemented methods as described in this specification. Software portions of backup server 324 may run as a daemon process.

Backup server 324 may include one or more non-transitory computer-readable mediums having stored thereon executable instructions operable to instruct a processor to provide backup server 324. At an appropriate time, such as upon booting server 300 or upon a command from operating system 322 or a user 120 or security administrator 150, processor 310 may retrieve a copy of backup server 324 (or software portions thereof) from storage 350 and load it into memory 320. Processor 310 may then iteratively execute the instructions of backup server 324 to provide the desired method.

FIG. 4 is a block diagram illustrating one non-limiting example of a client-server backup model. In the example of FIG. 4, a client device 200 includes an attached storage 250, which may contain source data that may be backed up for data preservation purposes. In this case, an operator of client device 200 contracts with a backup service 402 to provide off-site backup services. For example, a backup agent 224 on client device 200 may back up data from storage 250 to backup service 402 via network 172. The backup may occur on a continuous basis (e.g., any time a file is committed to disk), or it could be a regularly-scheduled backup, such as a nightly backup.

Client device 200 sends backup data to backup service 402, which provides the backup data to a data center 404. In this example, data center 404 includes a storage controller 406, having attached thereto a storage 350. Storage controller 406 may be a node in a RAIN array, or it may have several disks attached to it in a RAID configuration. Many other configurations are possible, and are intended to be included within the scope of this specification.

When client device 200 provides backup data to backup service 402 via network 172, storage controller 406 may operate backup server 324 to analyze the data and assign the backup a reputation according to the methods disclosed herein. If a quota has been reached, then based on the reputation that is assigned to the backup version, as well as the reputations that are assigned to previous backup versions, one or more backup versions may be demoted, such as being erased or moved to a less expensive storage solution.

Note that this model is only one of many possible models that are compatible with the methods disclosed herein. In other embodiments, client device 200 could have attached thereto a separate storage that is provided for backup purposes. In that case, the intelligent backup engine may be embodied within backup agent 224, and backup service 402 may not be involved. In yet other embodiments, a hybrid solution may be used, in which client device 200 keeps local backups, and also sends off-site backups to backup service 402. In that case, the intelligent backup engine may be embodied both within storage controller 406 and client device 200.

FIG. 5 is a block diagram of backup versioning according to one or more examples of the present specification. This example discloses backup chains 500 and 502 that illustrate certain aspects of the present specification.

In the example of backup chain 500, version 1 is the initial version in which a full backup is performed. Because this is an initial full backup, it may receive a default reputation, such as 20. No changes are embodied in this backup by definition. Note that in this embodiment, the reputation is a scalar value that varies directly with the “goodness” of the backup. Other embodiments are possible.

Version 2 may include minor changes, resulting in a relatively small overall delta for the backup version. Note that the delta may be with respect to individual files (file level delta) or to the backup as a whole (filesystem level delta). This version may receive a reputation of 60.

Again in version 3, minor changes are made and the backup version receives a reputation of 60.

In version 4, certain files are deleted. This triggers a high reputation of 100 because major changes have been made, resulting in a relatively large delta. Also, the deletion of certain files may be a special trigger that causes a longer-term preservation of the last or best version of that file. Version 4 is less likely to be deleted, and in some cases, to preserve a backup of the deleted file, this version is flagged so that it is not deleted at all, or not until a threshold time has passed.

Note that the versioning between versions 1, 2, 3, and 4 may be file level versions, or may be file system level versions.
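
By way of non-limiting illustration only, the reputation assignments of backup chain 500 might be sketched as follows. The names `Version`, `score_version`, and the three constants are hypothetical; the values 20, 60, and 100 are taken from the example above, while the rule that any file deletion both raises the reputation to 100 and sets a preservation flag is a simplifying assumption.

```python
from dataclasses import dataclass

DEFAULT_REPUTATION = 20    # initial full backup (version 1)
MINOR_REPUTATION = 60      # small delta (versions 2 and 3)
DELETION_REPUTATION = 100  # files deleted (version 4)


@dataclass
class Version:
    number: int
    reputation: int
    preserve: bool = False  # flagged so that it is not readily deleted


def score_version(number: int, is_initial: bool, files_deleted: int) -> Version:
    """Hypothetical scoring rule mirroring backup chain 500."""
    if is_initial:
        return Version(number, DEFAULT_REPUTATION)
    if files_deleted > 0:
        # Deletion is treated as a special trigger for longer-term preservation.
        return Version(number, DELETION_REPUTATION, preserve=True)
    return Version(number, MINOR_REPUTATION)


chain_500 = [
    score_version(1, is_initial=True, files_deleted=0),
    score_version(2, is_initial=False, files_deleted=0),
    score_version(3, is_initial=False, files_deleted=0),
    score_version(4, is_initial=False, files_deleted=3),
]
```

A practical embodiment would likely compute the reputation from a measured delta rather than from a simple deletion count; the fixed values here only reproduce the figures of the example.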

Backup chain 502 illustrates an example where a potential ransomware attack occurs.

In version 1, as before, an initial backup is made with no changes. This version is assigned a default reputation of 20.

In version 2, minor changes are made, resulting in a small delta. This version receives a reputation of 40.

Version 3 again includes minor changes, resulting in a small delta, and initially receives a reputation of 60.

However, in version 4, major changes are made. In this case, the number and/or type of changes exceed a delta threshold that indicates a potential malware attack. Thus, this version receives a middling reputation of 60. However, version 3 may now retroactively be assigned a much higher reputation, such as 100, indicating that version 3 may be a last good version before the malware attack occurred.

Thus, if a user tries to access a file in version 4 and discovers that ransomware has encrypted the data in that version, then version 3 (which has a very high reputation) may be flagged as a likely candidate for the last good version of those files. The user can then retrieve version 3, inspect the files, and upon determining that the files are the desired good files, restore the files from version 3. This has the advantage of not only preserving a good copy of the state of the files just before the ransomware attack occurred, but also of helping the user to easily identify that last good copy and restore it.
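
By way of non-limiting illustration only, the retroactive promotion shown in backup chain 502 might be sketched as follows. The function `add_version`, the constant names, and the representation of the delta as a fraction of changed data with a threshold of 0.5 are all hypothetical assumptions; only the reputation values 20, 40, 60, and 100 come from the example above.

```python
from typing import List

DELTA_THRESHOLD = 0.5       # hypothetical: fraction of data changed
SUSPECT_REPUTATION = 60     # middling reputation for a suspicious version
LAST_GOOD_REPUTATION = 100  # retroactive reputation for the prior version


def add_version(reputations: List[int], delta: float, normal_reputation: int) -> List[int]:
    """Append a new version's reputation, promoting the prior version if the
    new delta exceeds the threshold. Returns a new list."""
    updated = list(reputations)
    if updated and delta > DELTA_THRESHOLD:
        # The new version is suspicious, so the previous version is promoted
        # as the likely last good version.
        updated[-1] = max(updated[-1], LAST_GOOD_REPUTATION)
        updated.append(SUSPECT_REPUTATION)
    else:
        updated.append(normal_reputation)
    return updated


chain_502 = [20]                                # version 1: default reputation
chain_502 = add_version(chain_502, 0.02, 40)    # version 2: small delta
chain_502 = add_version(chain_502, 0.03, 60)    # version 3: small delta
chain_502 = add_version(chain_502, 0.90, 60)    # version 4: exceeds threshold
# chain_502 is now [20, 40, 100, 60]
```

The sketch considers only the size of the delta; as noted above, the type of changes (for example, widespread encryption of files) may also contribute to the determination.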

FIG. 6 is a flowchart of a method 600 according to one or more examples of the present specification. Method 600 may be performed in certain embodiments by one or both of backup agent 224 of FIG. 2 and backup server 324 of FIG. 3, or by any other hardware or software configuration that provides an intelligent backup engine.

For ease of reference, the operations of method 600 are said herein to be performed by an intelligent backup engine. This intelligent backup engine should be broadly understood to include any hardware and/or software elements that are configured or operable to perform the disclosed operations. Note that in certain embodiments, some of the operations disclosed herein may be distributed such as between a client and a server, or between different components on the same device, without affecting the integrity of the method.

In block 602, the intelligent backup engine initiates a new backup.

In decision block 604, the intelligent backup engine determines whether this is a new backup. For example, a new backup may be one in which data have not been backed up before, so that an initial copy needs to be made including a full backup of all data from the source. In other examples, changes may be substantial enough that it is desirable to initiate a new backup, or it may be desirable to store a full version of the backup from time to time.

If the initiated backup is a new backup rather than an update, then in block 614 the intelligent backup engine stores the backup, and in block 699, the method is done.

Returning to block 604, if this is an update of a previously stored backup, then in block 606, the reputation engine retrieves the previously stored context and reputation for the prior version of the backup. The prior context and reputation may help to inform the current context and reputation. For example, the previous context and reputation may negatively affect the current context and reputation if they have changed substantially. Note that in some cases, the current context and reputation may also influence the previous context and reputation, such as where the current backup is considered suspicious, and the previous backup is promoted to ensure that it is retained longer.

In decision block 608, the intelligent backup engine determines whether the quota has been reached. Note that the quota may be associated with a user of the backup service, a device, a backup source, a backup object, or any other identifiable network object for which a quota may be assigned. If the quota has not been reached, then in block 614 the current version is stored, and in block 699 the method is done.

Returning to block 608, if the quota has been reached, then the intelligent backup engine may need to identify a backup version that is a good candidate for being demoted as described herein. Note that in some embodiments, the process of identifying a best candidate for demotion may be performed off-line such as during periods of low demand. In other embodiments, the decision may be made at the time the quota is reached. In this example, in block 610, the intelligent backup engine identifies the least reputable available version of the backup object. Note that in certain embodiments, the current backup version may be expressly excluded from being tagged as the least reputable version. This may be to avoid losing the most recent backup. In other embodiments, the current backup version may be considered a potential candidate for least reputable status.

In block 612, the intelligent backup engine demotes the least reputable backup version to free up room for the current backup version. Note that in some embodiments, it may be necessary to remove multiple backup versions to make room for the current backup version. This is because not all backup versions are the same size (e.g., a smaller delta may have less overall data than a larger delta). In that case, the intelligent backup engine may identify a plurality of least reputable backup versions and demote those to make room for the current backup version.

In block 614, the intelligent backup engine stores the new backup version to the backup target.

In block 699, the method is done.
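By way of nonlimiting illustration, blocks 608 through 614 could be sketched as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: the names `BackupVersion` and `BackupStore` are hypothetical, the quota is assumed to be a byte count, demotion is modeled simply as removal from the store, and the incoming version is excluded from demotion (one of the two options described above).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BackupVersion:
    version_id: str
    size: int          # bytes occupied in the backup store
    reputation: float  # higher means more likely to be valid


@dataclass
class BackupStore:
    quota: int  # total bytes allowed (assumed unit)
    versions: List[BackupVersion] = field(default_factory=list)

    def used(self) -> int:
        return sum(v.size for v in self.versions)

    def store(self, new: BackupVersion) -> List[BackupVersion]:
        """Store `new`, demoting least reputable versions until it fits.

        Returns the demoted versions in the order they were demoted.
        """
        if new.size > self.quota:
            raise ValueError("new version is larger than the entire quota")
        demoted: List[BackupVersion] = []
        # Block 608: would storing the new version exceed the quota?
        while self.used() + new.size > self.quota:
            # Block 610: find the least reputable stored version. The new
            # version is not yet in the store, so it cannot be selected.
            victim = min(self.versions, key=lambda v: v.reputation)
            # Block 612: demote it (modeled here as removal) to free room.
            self.versions.remove(victim)
            demoted.append(victim)
        # Block 614: store the new version.
        self.versions.append(new)
        return demoted
```

Because versions differ in size, the loop may demote more than one version before the new version fits, which corresponds to the plurality of least reputable versions discussed in connection with block 612.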

FIG. 7 is a flowchart of a method 700 according to one or more examples of the present specification. As with the example of method 600 of FIG. 6, by way of concrete illustrative example, the operations of method 700 are said to be performed by an intelligent backup engine. As before, the intelligent backup engine should be broadly construed and understood to encompass any suitable hardware and/or software configuration that may constitute an intelligent backup engine.

In block 702, the user operates a computer and stores data on its storage, constituting a backup source.

In block 704, the intelligent backup engine performs continuous or periodic backups. This could include backing up a file every time a change in that file is committed to the disk, or it could include regular nightly or other periodic backups that include all changes that have been made since the last backup. Depending on the context, this may mean that backups are considered at the individual file level, or at the file system level, as appropriate.

By way of illustration, a ransomware attack 706 may be directed toward the device. In block 708, the ransomware attack is detected. This could include the user trying to access a file and discovering that the file has been compromised, or it could include a security agent running on the computer scanning the file system and identifying the ransomware attack according to a signature or other method.

In block 710, the user (or the computer, autonomously) requests the latest good backup from a backup server or other backup destination. Note that in some cases, especially if the user has only just discovered the ransomware attack, the user may not know where the last good version of a file can be found. In this case, it may be beneficial for the intelligent backup engine to identify a previous backup with the highest reputation, which may represent a last good version. The user may then be directed to the last good version.

In block 714, the backup server sends the requested backup data to the user, and the intelligent backup engine restores the lost data.

In block 799, the method is done.
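By way of nonlimiting illustration, the selection of a last good version in block 710 could be sketched as follows. This is a sketch under stated assumptions only: the names `StoredVersion` and `last_good_version` are hypothetical, the optional `before` cutoff (for example, the time the attack was detected) is an assumption, and breaking ties in favor of the most recent version is an illustrative design choice that the specification does not prescribe.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class StoredVersion:
    version_id: str
    created: datetime
    reputation: float  # higher means more likely to be valid


def last_good_version(versions: Sequence[StoredVersion],
                      before: Optional[datetime] = None
                      ) -> Optional[StoredVersion]:
    """Pick the stored version with the highest reputation.

    If `before` is given, only versions created strictly earlier are
    considered. Ties on reputation go to the most recent version
    (an assumption made for this sketch).
    """
    candidates = [v for v in versions
                  if before is None or v.created < before]
    if not candidates:
        return None
    return max(candidates, key=lambda v: (v.reputation, v.created))
```

With such a helper, a user who does not know which version predates the attack could be directed to the returned version instead of searching the version history manually.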

The foregoing outlines features of several embodiments so that those skilled in the art may better understand various aspects of the present disclosure. Those skilled in the art should appreciate that they may readily use the present disclosure as a basis for designing or modifying other processes and structures for carrying out the same purposes and/or achieving the same advantages of the embodiments introduced herein. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the present disclosure, and that they may make various changes, substitutions, and alterations herein without departing from the spirit and scope of the present disclosure.

All or part of any hardware element disclosed herein may readily be provided in a system-on-a-chip (SoC), including central processing unit (CPU) package. An SoC represents an integrated circuit (IC) that integrates components of a computer or other electronic system into a single chip. Thus, for example, client devices 110 or server devices 300 may be provided, in whole or in part, in an SoC. The SoC may contain digital, analog, mixed-signal, and radio frequency functions, all of which may be provided on a single chip substrate. Other embodiments may include a multi-chip-module (MCM), with a plurality of chips located within a single electronic package and configured to interact closely with each other through the electronic package. In various other embodiments, the computing functionalities disclosed herein may be implemented in one or more silicon cores in application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), and other semiconductor chips.

Note also that in certain embodiments, some of the components may be omitted or consolidated. In a general sense, the arrangements depicted in the figures may be more logical in their representations, whereas a physical architecture may include various permutations, combinations, and/or hybrids of these elements. It is imperative to note that countless possible design configurations can be used to achieve the operational objectives outlined herein. Accordingly, the associated infrastructure has a myriad of substitute arrangements, design choices, device possibilities, hardware configurations, software implementations, and equipment options.

In a general sense, any suitably-configured processor, such as processor 210, can execute any type of instructions associated with the data to achieve the operations detailed herein. Any processor disclosed herein could transform an element or an article (for example, data) from one state or thing to another state or thing. In another example, some activities outlined herein may be implemented with fixed logic or programmable logic (for example, software and/or computer instructions executed by a processor) and the elements identified herein could be some type of a programmable processor, programmable digital logic (for example, an FPGA, an electrically erasable programmable ROM (EEPROM)), an ASIC that includes digital logic, software, code, electronic instructions, flash memory, optical disks, CD-ROMs, DVD ROMs, magnetic or optical cards, other types of machine-readable mediums suitable for storing electronic instructions, or any suitable combination thereof.

In operation, a storage such as storage 250 may store information in any suitable type of tangible, non-transitory storage medium (for example, RAM, ROM, FPGA, erasable programmable read only memory (EPROM), electrically erasable programmable ROM (EEPROM), etc.), software, hardware (for example, processor instructions or microcode), or in any other suitable component, device, element, or object where appropriate and based on particular needs. Furthermore, the information being tracked, sent, received, or stored in a processor could be provided in any database, register, table, cache, queue, control list, or storage structure, based on particular needs and implementations, all of which could be referenced in any suitable timeframe. Any of the memory or storage elements disclosed herein, such as memory 220 and storage 250, should be construed as being encompassed within the broad terms ‘memory’ and ‘storage,’ as appropriate. A non-transitory storage medium herein is expressly intended to include any non-transitory special-purpose or programmable hardware configured to provide the disclosed operations, or to cause a processor such as processor 210 to perform the disclosed operations.

Computer program logic implementing all or part of the functionality described herein is embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, machine instructions or microcode, programmable hardware, and various intermediate forms (for example, forms generated by an assembler, compiler, linker, or locator). In an example, source code includes a series of computer program instructions implemented in various programming languages, such as an object code, an assembly language, or a high-level language such as OpenCL, FORTRAN, C, C++, JAVA, or HTML for use with various operating systems or operating environments, or in hardware description languages such as Spice, Verilog, and VHDL. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form, or converted to an intermediate form such as byte code. Where appropriate, any of the foregoing may be used to build or describe appropriate discrete or integrated circuits, whether sequential, combinatorial, state machines, or otherwise.

In one example embodiment, any number of electrical circuits of the FIGURES may be implemented on a board of an associated electronic device. The board can be a general circuit board that can hold various components of the internal electronic system of the electronic device and, further, provide connectors for other peripherals. More specifically, the board can provide the electrical connections by which the other components of the system can communicate electrically. Any suitable processor and memory can be suitably coupled to the board based on particular configuration needs, processing demands, and computing designs. Other components such as external storage, additional sensors, controllers for audio/video display, and peripheral devices may be attached to the board as plug-in cards, via cables, or integrated into the board itself. In another example, the electrical circuits of the FIGURES may be implemented as standalone modules (e.g., a device with associated components and circuitry configured to perform a specific application or function) or implemented as plug-in modules into application specific hardware of electronic devices.

Note that with the numerous examples provided herein, interaction may be described in terms of two, three, four, or more electrical components. However, this has been done for purposes of clarity and example only. It should be appreciated that the system can be consolidated or reconfigured in any suitable manner. Along similar design alternatives, any of the illustrated components, modules, and elements of the FIGURES may be combined in various possible configurations, all of which are within the broad scope of this specification. In certain cases, it may be easier to describe one or more of the functionalities of a given set of flows by only referencing a limited number of electrical elements. It should be appreciated that the electrical circuits of the FIGURES and their teachings are readily scalable and can accommodate a large number of components, as well as more complicated/sophisticated arrangements and configurations. Accordingly, the examples provided should not limit the scope or inhibit the broad teachings of the electrical circuits as potentially applied to a myriad of other architectures.

Numerous other changes, substitutions, variations, alterations, and modifications may be ascertained to one skilled in the art and it is intended that the present disclosure encompass all such changes, substitutions, variations, alterations, and modifications as falling within the scope of the appended claims. In order to assist the United States Patent and Trademark Office (USPTO) and, additionally, any readers of any patent issued on this application in interpreting the claims appended hereto, Applicant wishes to note that the Applicant: (a) does not intend any of the appended claims to invoke paragraph six (6) of 35 U.S.C. section 112 (pre-AIA) or paragraph (f) of the same section (post-AIA), as it exists on the date of the filing hereof unless the words “means for” or “steps for” are specifically used in the particular claims; and (b) does not intend, by any statement in the specification, to limit this disclosure in any way that is not otherwise expressly reflected in the appended claims.

Example Implementations

There is disclosed in one example a computing apparatus, comprising: a processor and a memory; a network interface to communicatively couple to a backup client; a storage to receive backup data from the client, including a plurality of versions and an associated reputation for each version, the associated reputation to indicate a probability that the version is valid; and instructions encoded within the memory to instruct the processor to: receive from the backup client a request to store a new version of the backup data; determine that the client has exceeded a backup threshold; identify a backup version having a lowest reputation for validity; and expunge the backup version having the lowest reputation for validity.

There is further disclosed an example computing apparatus, wherein the instructions are further to store the new version of the backup data with an associated reputation.

There is further disclosed an example computing apparatus, wherein the instructions are further to compute a reputation for the new version of the backup data.

There is further disclosed an example computing apparatus, wherein computing the reputation for the new version of the backup data comprises computing a delta between the new version and a previous version.

There is further disclosed an example computing apparatus, wherein the previous version is an immediate previous version.

There is further disclosed an example computing apparatus, wherein a magnitude of the delta is inversely proportional to the reputation.

There is further disclosed an example computing apparatus, wherein computing the reputation includes the use of contextual data.

There is further disclosed an example computing apparatus, wherein the contextual data include a data source.

There is further disclosed an example computing apparatus, wherein the contextual data include a reputation of software that initiated change to the data.

There is further disclosed an example computing apparatus, wherein the contextual data include entropy compared to a reference entity for a document or object type in the backup.

There is further disclosed an example computing apparatus, wherein the contextual data include a writing pattern of the data.

There is further disclosed an example computing apparatus, wherein the instructions are further to provide a machine learning model to compute the reputation.

There is also disclosed an example of one or more tangible, non-transitory computer-readable storage media having stored thereon executable instructions to instruct a processor to: allocate backup data from a client device to a backup store; associate with a plurality of backup versions in the backup store individual reputation scores, the individual reputation scores representing a reputation for validity for the backup versions; associate with the client a quota for backups; receive a new incoming backup from the client device; receive from the client a new backup version; determine that new backup version exceeds the client's quota for backups; identify within the backup store a backup version having a lowest reputation score; drop the backup version having the lowest reputation score; and add the new incoming backup version to the backup store.

There is further disclosed an example of one or more tangible, non-transitory computer-readable media, wherein the instructions are further to compute a reputation for the new version of the backup data, wherein the reputation comprises contextual data.

There is further disclosed an example of one or more tangible, non-transitory computer-readable media, wherein computing the reputation for the new version of the backup data comprises computing a delta between the new version and a previous version.

There is further disclosed an example of one or more tangible, non-transitory computer-readable media, wherein the previous version is an immediate previous version.

There is further disclosed an example of one or more tangible, non-transitory computer-readable media, wherein a magnitude of the delta is inversely proportional to the reputation.

There is also disclosed an example computer-implemented method for remediating ransomware attacks on incremental backups, comprising: receiving a plurality of backup versions from a client device; associating with the backup versions individual reputations for validity; receiving a new incremental backup version that exceeds a backup quota for the client device; identifying from among the plurality of backup versions a backup version with a lowest reputation for validity; removing the backup version with the lowest reputation for validity; storing the new incremental backup version; computing for the new incremental backup version a new reputation for validity; and associating the new reputation for validity with the new incremental backup version.

There is further disclosed an example method, wherein computing the new reputation for validity comprises computing a delta with an immediate previous version, and assigning a reputation that varies inversely with the delta.

There is further disclosed an example method, wherein computing the new reputation for validity comprises accounting for contextual data about the new incremental backup.
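By way of nonlimiting illustration, a reputation that falls as the delta grows and that accounts for entropy relative to a reference could be sketched as follows. This is a sketch under stated assumptions, not the disclosed computation: the function names, the byte-position delta measure, the 0.0 to 1.0 reputation scale, the linear (rather than strictly inversely proportional) relationship, and the `entropy_weight` parameter are all hypothetical. The caller is assumed to supply `reference_entropy` for the document or object type.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def delta_fraction(old: bytes, new: bytes) -> float:
    """Fraction of byte positions that differ, counting length changes."""
    longest = max(len(old), len(new))
    if longest == 0:
        return 0.0
    differing = sum(a != b for a, b in zip(old, new))
    differing += abs(len(old) - len(new))
    return differing / longest


def reputation(old: bytes, new: bytes, reference_entropy: float,
               entropy_weight: float = 0.5) -> float:
    """Illustrative reputation in [0.0, 1.0] for `new` relative to `old`.

    The score decreases as the delta grows, and decreases further when the
    entropy of `new` exceeds the reference for its object type.
    """
    delta_score = 1.0 - delta_fraction(old, new)
    excess = max(0.0, shannon_entropy(new) - reference_entropy) / 8.0
    score = delta_score - entropy_weight * excess
    return min(1.0, max(0.0, score))
```

Encrypted data tends toward the maximum of 8 bits per byte, so a version that replaces most of a low-entropy document with near-random bytes scores low on both terms, while a small edit to the same document keeps a high score.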

What is claimed is:
 1. A computing apparatus, comprising: a processor and a memory; a network interface to communicatively couple to a backup client; a storage to receive backup data from the client, including a plurality of versions and an associated reputation for each version, the associated reputation to indicate a probability that the version is valid; and instructions encoded within the memory to instruct the processor to: receive from the backup client a request to store a new version of the backup data; determine that the client has exceeded a backup threshold; identify a backup version having a lowest reputation for validity; and expunge the backup version having the lowest reputation for validity.
 2. The computing apparatus of claim 1, wherein the instructions are further to store the new version of the backup data with an associated reputation.
 3. The computing apparatus of claim 1, wherein the instructions are further to compute a reputation for the new version of the backup data.
 4. The computing apparatus of claim 3, wherein computing the reputation for the new version of the backup data comprises computing a delta between the new version and a previous version.
 5. The computing apparatus of claim 4, wherein the previous version is an immediate previous version.
 6. The computing apparatus of claim 4, wherein a magnitude of the delta is inversely proportional to the reputation.
 7. The computing apparatus of claim 3, wherein computing the reputation includes the use of contextual data.
 8. The computing apparatus of claim 7, wherein the contextual data include a data source.
 9. The computing apparatus of claim 7, wherein the contextual data include a reputation of software that initiated change to the data.
 10. The computing apparatus of claim 7, wherein the contextual data include entropy compared to a reference entity for a document or object type in the backup.
 11. The computing apparatus of claim 7, wherein the contextual data include a writing pattern of the data.
 12. The computing apparatus of claim 3, wherein the instructions are further to provide a machine learning model to compute the reputation.
 13. One or more tangible, non-transitory computer-readable storage media having stored thereon executable instructions to instruct a processor to: allocate backup data from a client device to a backup store; associate with a plurality of backup versions in the backup store individual reputation scores, the individual reputation scores representing a reputation for validity for the backup versions; associate with the client a quota for backups; receive a new incoming backup from the client device; receive from the client a new backup version; determine that new backup version exceeds the client's quota for backups; identify within the backup store a backup version having a lowest reputation score; drop the backup version having the lowest reputation score; and add the new incoming backup version to the backup store.
 14. The one or more tangible, non-transitory computer-readable media of claim 13, wherein the instructions are further to compute a reputation for the new version of the backup data, wherein the reputation comprises contextual data.
 15. The one or more tangible, non-transitory computer-readable media of claim 14, wherein computing the reputation for the new version of the backup data comprises computing a delta between the new version and a previous version.
 16. The one or more tangible, non-transitory computer-readable media of claim 15, wherein the previous version is an immediate previous version.
 17. The one or more tangible, non-transitory computer-readable media of claim 14, wherein a magnitude of the delta is inversely proportional to the reputation.
 18. A computer-implemented method for remediating ransomware attacks on incremental backups, comprising: receiving a plurality of backup versions from a client device; associating with the backup versions individual reputations for validity; receiving a new incremental backup version that exceeds a backup quota for the client device; identifying from among the plurality of backup versions a backup version with a lowest reputation for validity; removing the backup version with the lowest reputation for validity; storing the new incremental backup version; computing for the new incremental backup version a new reputation for validity; and associating the new reputation for validity with the new incremental backup version.
 19. The method of claim 18, wherein computing the new reputation for validity comprises computing a delta with an immediate previous version, and assigning a reputation that varies inversely with the delta.
 20. The method of claim 18, wherein computing the new reputation for validity comprises accounting for contextual data about the new incremental backup.